Surprising Study: Less Sleep Improves Test Scores?

A Statistical Mystery

Breaking News!

New Study Shows: Less Sleep = Higher Test Scores!

Study Design:

  • Recruited 6 students with naturally different sleep patterns
  • Tracked each student for 5 nights
  • Recorded their sleep hours and next-day test performance
  • Total: 30 observations (6 students × 5 nights each)

Warning

Note: This is simulated data for teaching purposes only
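The simulation code itself is not shown on these slides. As a minimal sketch, data with the same structure (6 students × 5 nights) could be generated in base R as below. The data frame name `sleep_data`, the column names, the seed, and every numeric value (typical sleep, typical score, per-student slopes, noise level) are assumptions chosen for illustration, not the values behind the plots.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible
students   <- LETTERS[1:6]
mean_sleep <- seq(5, 9, length.out = 6)    # assumed typical sleep per student (hours)
mean_score <- seq(95, 60, length.out = 6)  # assumed typical score per student
slope      <- c(2.5, 3, 3.5, 4, 4.5, 3.5)  # assumed within-student effect (points per hour)
night_dev  <- c(-1, -0.5, 0, 0.5, 1)       # nightly deviation from typical sleep (hours)

# One block of 5 rows per student, stacked into a single long data frame
sleep_data <- do.call(rbind, lapply(seq_along(students), function(i) {
  data.frame(
    student     = factor(students[i], levels = students),
    night       = seq_along(night_dev),
    sleep_hours = mean_sleep[i] + night_dev,
    score       = mean_score[i] + slope[i] * night_dev + rnorm(5, sd = 0.5)
  )
}))

str(sleep_data)  # 30 rows: student, night, sleep_hours, score
```

In this sketch the students who sleep more are deliberately given lower typical scores, which is the ingredient the rest of the deck relies on.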

The shocking result…

The Data: All Students Combined

Each extra hour of sleep reduces scores by 9.8 points!

Wait… That Seems Wrong

Let’s look at each student individually to check if this makes sense.

Maybe there’s something we’re missing?

Student A

Student A: More sleep = Better scores

Student B

Student B: More sleep = Better scores

Student C

Student C: More sleep = Better scores

Student D

Student D: More sleep = Better scores

Student E

Student E: More sleep = Better scores

Student F

Student F: More sleep = Better scores

Hold On…

Every single student shows: More sleep = Better performance

But the aggregate data showed: Less sleep = Better performance!

How is this possible?

Let’s look at that first plot again…

Back to the Aggregate View

Question: How can the overall trend be negative when every individual trend is positive?

Let’s Color by Student

Notice anything?

Connect the Dots

The Mystery, Solved: Students with higher baseline scores happened to sleep less!

This is Simpson’s Paradox

  • Each student has their own baseline performance level
  • In our small sample, students who naturally slept more happened to have lower baseline scores
  • Baseline performance therefore confounds the relationship between sleep and scores
  • The aggregate trend reverses the individual trends
  • Within each student: More sleep helps performance
  • Across students: By chance, baseline scores were correlated with natural sleep patterns
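One way to see how the reversal can happen is the standard decomposition of a pooled regression slope into a between-student part and a within-student part. The notation below is introduced here for illustration: $x_{ij}$ and $y_{ij}$ are sleep and score for student $j$ on night $i$, $\bar{x}_j$ and $\bar{y}_j$ are student means, $\bar{x}$ and $\bar{y}$ are grand means, and $n_j$ is the number of nights for student $j$.

```latex
\hat\beta_{\text{pooled}}
  = \eta^2 \, \hat\beta_{\text{between}} + (1 - \eta^2) \, \hat\beta_{\text{within}},
\qquad
\eta^2 = \frac{\sum_j n_j (\bar{x}_j - \bar{x})^2}{\sum_j \sum_i (x_{ij} - \bar{x})^2}

\hat\beta_{\text{between}}
  = \frac{\sum_j n_j (\bar{x}_j - \bar{x})(\bar{y}_j - \bar{y})}{\sum_j n_j (\bar{x}_j - \bar{x})^2},
\qquad
\hat\beta_{\text{within}}
  = \frac{\sum_j \sum_i (x_{ij} - \bar{x}_j)(y_{ij} - \bar{y}_j)}{\sum_j \sum_i (x_{ij} - \bar{x}_j)^2}
```

When most of the variation in sleep is between students ($\eta^2$ close to 1) and the between-student slope is strongly negative, the pooled slope comes out negative even though the within-student slope is positive.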

So How Do Students Differ?

When we have repeated measures, students can differ in two fundamental ways:

  1. Baseline performance (where their line starts) → Random Intercepts
  2. Response to sleep (how steep their line is) → Random Slopes

Let’s build up the model step by step to see how we capture these differences…

Way 1: Different Baselines

Random Intercept Model: lmer(score ~ sleep + (1 | student))

This model assumes students differ in baseline performance, but respond the same way to sleep.

Each student starts at a different level, but the model assumes sleep affects everyone equally
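Written out as an equation, the random intercept model can be sketched as follows. The symbols are notation assumed here, not taken from the slides: $i$ indexes nights, $j$ indexes students, and $u_{0j}$ is student $j$'s deviation from the average baseline.

```latex
\text{score}_{ij} = (\beta_0 + u_{0j}) + \beta_1 \, \text{sleep}_{ij} + \varepsilon_{ij},
\qquad
u_{0j} \sim \mathcal{N}(0, \tau_0^2),
\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

The slope $\beta_1$ carries no student subscript, which is exactly the "same response for everyone" assumption.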

Way 2: Different Responses

Random Slope Model: lmer(score ~ sleep + (0 + sleep | student))

This model assumes students start at the same baseline, but respond differently to sleep.

Some students are more “sleep-sensitive” (steeper slopes) than others
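In equation form, a sketch of the random slope model moves the student-specific term from the intercept to the slope. As before, the symbols are assumed notation: $i$ indexes nights, $j$ indexes students, and $u_{1j}$ is student $j$'s deviation from the average sleep effect.

```latex
\text{score}_{ij} = \beta_0 + (\beta_1 + u_{1j}) \, \text{sleep}_{ij} + \varepsilon_{ij},
\qquad
u_{1j} \sim \mathcal{N}(0, \tau_1^2),
\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here the intercept $\beta_0$ is shared, so all student lines pass through the same point at zero hours of sleep.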

Way 3: Both Together

Random Slopes & Intercepts Model: lmer(score ~ sleep + (sleep | student))

This model allows students to differ in both baseline AND their response to sleep.

This captures both sources of variation we see in the data
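As a sketch in equation form (symbols are assumed notation, with $i$ indexing nights and $j$ indexing students), the full model gives each student both an intercept deviation $u_{0j}$ and a slope deviation $u_{1j}$, and also estimates how the two are correlated:

```latex
\text{score}_{ij} = (\beta_0 + u_{0j}) + (\beta_1 + u_{1j}) \, \text{sleep}_{ij} + \varepsilon_{ij},
\qquad
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
\sim \mathcal{N}\!\left(
  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} \tau_0^2 & \rho \, \tau_0 \tau_1 \\ \rho \, \tau_0 \tau_1 & \tau_1^2 \end{pmatrix}
\right),
\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Compared with the random intercept model, this adds two parameters ($\tau_1^2$ and $\rho$), which is part of why it is harder to fit with few students.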

Why Not Always Use the Full Model?

The full model (sleep | student) seems best, so why use simpler models?

  1. Convergence issues - Complex models may fail to fit with small samples or little variation
  2. Overfitting - Estimating too many parameters with limited data
  3. Parsimony - Simpler models are easier to interpret and communicate
  4. Theory-driven choices - Sometimes you know slopes shouldn’t vary (e.g., physical laws)
  5. Data don’t support it - If there’s no slope variation, the model won’t benefit from random slopes

Note

Best practice: Start complex, simplify if needed. Use model comparison (AIC, BIC, likelihood ratio tests) to guide decisions.
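A minimal sketch of such a comparison, assuming the lme4 package is installed and using illustrative simulated data (the data frame `sleep_data`, its columns, and all numeric values are assumptions, not the data behind the slides). Both models are fit with maximum likelihood (`REML = FALSE`) so that their likelihoods are comparable.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible
students   <- LETTERS[1:6]
mean_sleep <- seq(5, 9, length.out = 6)    # assumed typical sleep per student (hours)
mean_score <- seq(95, 60, length.out = 6)  # assumed typical score per student
slope      <- c(2.5, 3, 3.5, 4, 4.5, 3.5)  # assumed within-student effect (points per hour)
night_dev  <- c(-1, -0.5, 0, 0.5, 1)       # nightly deviation from typical sleep (hours)
sleep_data <- do.call(rbind, lapply(seq_along(students), function(i) {
  data.frame(
    student     = factor(students[i], levels = students),
    sleep_hours = mean_sleep[i] + night_dev,
    score       = mean_score[i] + slope[i] * night_dev + rnorm(5, sd = 0.5)
  )
}))

# Only run the comparison if lme4 is available
if (requireNamespace("lme4", quietly = TRUE)) {
  m_intercept <- lme4::lmer(score ~ sleep_hours + (1 | student),
                            data = sleep_data, REML = FALSE)
  m_full      <- lme4::lmer(score ~ sleep_hours + (sleep_hours | student),
                            data = sleep_data, REML = FALSE)
  # Table with AIC, BIC, log-likelihood and a likelihood ratio test
  comparison <- anova(m_intercept, m_full)
  print(comparison)
}
```

With only 6 students, the full model may emit singular-fit or convergence warnings, which is the first item on the list above showing up in practice.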

The Complete Picture

Our data show BOTH sources of variation:

Random intercepts (baselines) + Random slopes (sleep sensitivity) = Full model

Wait… What Question Are We Asking?

“Does sleep affect test performance?” is actually THREE different questions!

  1. Between-person: “Do students who generally sleep more score better than students who sleep less?”
  2. Within-person: “When a specific student sleeps more, do they score better?”
  3. Average within-person: “On average across all students, how much does sleeping more help?”

Each question requires a different model and gives a different answer!

Question 1: Between-Person Effect

“Do students who sleep more score better?”

# Pooled model: ignores which student each night belongs to, so between-person differences dominate
model_between <- lm(score ~ sleep_hours)

Answer: NO → This is confounded by individual differences!
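Strictly speaking, a pooled regression mixes between-person and within-person variation. A purely between-person comparison uses one row per student. The sketch below shows both on illustrative simulated data; the data frame `sleep_data`, the model names, and all numeric values are assumptions made for this example.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible
students   <- LETTERS[1:6]
mean_sleep <- seq(5, 9, length.out = 6)    # assumed typical sleep per student (hours)
mean_score <- seq(95, 60, length.out = 6)  # assumed typical score per student
slope      <- c(2.5, 3, 3.5, 4, 4.5, 3.5)  # assumed within-student effect (points per hour)
night_dev  <- c(-1, -0.5, 0, 0.5, 1)       # nightly deviation from typical sleep (hours)
sleep_data <- do.call(rbind, lapply(seq_along(students), function(i) {
  data.frame(
    student     = factor(students[i], levels = students),
    sleep_hours = mean_sleep[i] + night_dev,
    score       = mean_score[i] + slope[i] * night_dev + rnorm(5, sd = 0.5)
  )
}))

# Pooled regression on all 30 observations
model_pooled <- lm(score ~ sleep_hours, data = sleep_data)

# Between-person regression: collapse to one mean row per student first
student_means <- aggregate(cbind(sleep_hours, score) ~ student,
                           data = sleep_data, FUN = mean)
model_between_means <- lm(score ~ sleep_hours, data = student_means)

slope_pooled  <- coef(model_pooled)[["sleep_hours"]]
slope_between <- coef(model_between_means)[["sleep_hours"]]
cat("pooled slope: ", round(slope_pooled, 2), "\n")
cat("between slope:", round(slope_between, 2), "\n")
```

In this sketch both slopes are negative, because both are driven by which students sleep more rather than by what happens when a given student sleeps more.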

Question 2: Within-Person Effects

“When each student sleeps more, do they score better?”

# Look at each individual separately

Answer: YES → Every student benefits! But effects vary (some steep, some flat)
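The comment above has no code attached. One possible way to look at each individual separately is to fit one small regression per student, sketched here in base R on illustrative simulated data (the data frame `sleep_data` and all numeric values are assumptions for this example).

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible
students   <- LETTERS[1:6]
mean_sleep <- seq(5, 9, length.out = 6)    # assumed typical sleep per student (hours)
mean_score <- seq(95, 60, length.out = 6)  # assumed typical score per student
slope      <- c(2.5, 3, 3.5, 4, 4.5, 3.5)  # assumed within-student effect (points per hour)
night_dev  <- c(-1, -0.5, 0, 0.5, 1)       # nightly deviation from typical sleep (hours)
sleep_data <- do.call(rbind, lapply(seq_along(students), function(i) {
  data.frame(
    student     = factor(students[i], levels = students),
    sleep_hours = mean_sleep[i] + night_dev,
    score       = mean_score[i] + slope[i] * night_dev + rnorm(5, sd = 0.5)
  )
}))

# Split the data by student and fit a separate line to each student's 5 nights
per_student_slopes <- sapply(split(sleep_data, sleep_data$student), function(d) {
  coef(lm(score ~ sleep_hours, data = d))[["sleep_hours"]]
})

print(round(per_student_slopes, 2))  # one slope per student, named A to F
cat("mean of the per-student slopes:", round(mean(per_student_slopes), 2), "\n")
```

Each of these slopes rests on only 5 nights, so individually they are noisy. Pooling information across students is what the hierarchical model adds.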

Visualizing the Answer

The dashed line represents the average effect WITHIN students

Why This Matters

The Research Question Determines The Model

  • “Do people who sleep more perform better?” → Between-person (observational, confounded)
  • “Does sleeping more help performance?” → Within-person (holds stable individual differences fixed; closer to the causal question we want!)

The hierarchical model:

  • Separates within-person from between-person effects
  • Gives us the average causal effect we care about
  • Accounts for individual differences
  • Provides the right answer to “does sleep help?”
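One common way to make the within/between separation explicit is to split the predictor into each student's mean and the nightly deviation from that mean. The sketch below does this with plain `lm()` on illustrative simulated data; the column names `sleep_between` and `sleep_within`, the data frame `sleep_data`, and all numeric values are assumptions for this example.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible
students   <- LETTERS[1:6]
mean_sleep <- seq(5, 9, length.out = 6)    # assumed typical sleep per student (hours)
mean_score <- seq(95, 60, length.out = 6)  # assumed typical score per student
slope      <- c(2.5, 3, 3.5, 4, 4.5, 3.5)  # assumed within-student effect (points per hour)
night_dev  <- c(-1, -0.5, 0, 0.5, 1)       # nightly deviation from typical sleep (hours)
sleep_data <- do.call(rbind, lapply(seq_along(students), function(i) {
  data.frame(
    student     = factor(students[i], levels = students),
    sleep_hours = mean_sleep[i] + night_dev,
    score       = mean_score[i] + slope[i] * night_dev + rnorm(5, sd = 0.5)
  )
}))

# Student mean of sleep (between part) and nightly deviation from it (within part)
sleep_data$sleep_between <- ave(sleep_data$sleep_hours, sleep_data$student)
sleep_data$sleep_within  <- sleep_data$sleep_hours - sleep_data$sleep_between

# Each part gets its own coefficient
model_wb <- lm(score ~ sleep_within + sleep_between, data = sleep_data)
within_effect  <- coef(model_wb)[["sleep_within"]]
between_effect <- coef(model_wb)[["sleep_between"]]
cat("within-student effect: ", round(within_effect, 2), "\n")
cat("between-student effect:", round(between_effect, 2), "\n")
```

The point estimates separate cleanly, but the standard errors from a plain `lm()` ignore the repeated-measures structure. The same two predictors can be used in a mixed model with a `(1 | student)` term to address that.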

Key Lessons

  1. Simpson’s Paradox is real - Aggregate trends can completely reverse individual trends
  2. Ignoring group structure = Wrong conclusions - We’d tell students to sleep less!
  3. Random effects capture reality:
    • Random intercepts = People differ in baseline
    • Random slopes = People differ in how they respond
  4. The fixed effect in a hierarchical model ≈ Average within-group effect (purely within-group when the predictor is centered within each group)
  5. This answers our causal question: “Does sleep help performance?” → YES, ~3.5 points/hour on average
  6. The model formula matters:
    • lm(y ~ x) - Ignores grouping, mixes between- and within-person effects (WRONG here)
    • lmer(y ~ x + (1 | id)) - Different baselines
    • lmer(y ~ x + (x | id)) - Different baselines + responses (BEST)
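To compare the three formulas side by side, the sketch below fits each one to illustrative simulated data and prints the estimated sleep coefficient. It assumes the lme4 package is installed; the data frame `sleep_data`, the model names, and all numeric values are assumptions for this example, so the printed numbers will not match the slides.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible
students   <- LETTERS[1:6]
mean_sleep <- seq(5, 9, length.out = 6)    # assumed typical sleep per student (hours)
mean_score <- seq(95, 60, length.out = 6)  # assumed typical score per student
slope      <- c(2.5, 3, 3.5, 4, 4.5, 3.5)  # assumed within-student effect (points per hour)
night_dev  <- c(-1, -0.5, 0, 0.5, 1)       # nightly deviation from typical sleep (hours)
sleep_data <- do.call(rbind, lapply(seq_along(students), function(i) {
  data.frame(
    student     = factor(students[i], levels = students),
    sleep_hours = mean_sleep[i] + night_dev,
    score       = mean_score[i] + slope[i] * night_dev + rnorm(5, sd = 0.5)
  )
}))

# 1. Pooled regression: no grouping information at all
slope_pooled <- coef(lm(score ~ sleep_hours, data = sleep_data))[["sleep_hours"]]
cat("lm, pooled slope:              ", round(slope_pooled, 2), "\n")

# 2 and 3. Mixed models, fitted only if lme4 is available
if (requireNamespace("lme4", quietly = TRUE)) {
  m_ri   <- lme4::lmer(score ~ sleep_hours + (1 | student), data = sleep_data)
  m_full <- lme4::lmer(score ~ sleep_hours + (sleep_hours | student), data = sleep_data)
  slope_ri   <- lme4::fixef(m_ri)[["sleep_hours"]]
  slope_full <- lme4::fixef(m_full)[["sleep_hours"]]
  cat("random intercepts, fixed slope:", round(slope_ri, 2), "\n")
  cat("full model, fixed slope:       ", round(slope_full, 2), "\n")
}
```

With so few students, the full model may report a singular fit or a convergence warning. That is a cue to check whether the random slopes are supported by the data rather than a reason to trust the pooled `lm()`.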

Questions?

The moral of the story: When you have repeated measures, you MUST account for individual differences, or you might conclude the exact opposite of the truth!